Temperature-Aware Modeling and Banking of IC Lifetime Reliability

نویسندگان

  • Zhijian Lu
  • John Lach
  • Mircea Stan
  • Kevin Skadron
چکیده

Most existing integrated circuit (IC) reliability models assume a uniform, typically worst-case, operating temperature, but temporal and spatial temperature variations affect expected device lifetime. As a result, design decisions and dynamic thermal management (DTM) techniques using worst-case models are pessimistic and result in excessive design margins and unnecessary runtime engagement of cooling mechanisms (and associated performance penalties). By leveraging a reliability model that accounts for temperature gradients (dramatically improving interconnect lifetime prediction accuracy) and modeling expected lifetime as a resource that is consumed over time at a temperatureand voltage-dependent rate, substantial design margin can be reclaimed and runtime penalties avoided while meeting expected lifetime requirements. In this paper, we evaluate the potential benefits and implementations of this technique by tracking the expected lifetime of a system under different workloads while accounting for the impact of dynamic voltage and temperature variations. Simulation results show that our dynamic reliability management (DRM) techniques provide a 40% performance penalty reduction over that incurred by pessimistic DTM in general-purpose computing and a 10% increase in quality of service (QoS) for servers, all while preserving the expected IC lifetime reliability.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Interconnect Lifetime Prediction with Temporal and Spatial Temperature Gradients for Reliability-Aware Design and Runtime Management: Modeling and Applications

Thermal effects are becoming a limiting factor in high performance circuit design due to the strong temperature dependence of leakage power, circuit performance, IC package cost and reliability. While many interconnect reliability models assume a constant temperature, this paper analyzes the effects of temporal and spatial thermal gradients on interconnect lifetime in terms of electromigration....

متن کامل

The Importance of Temporal and Spatial Temperature Gradients in IC Reliability Analysis

Existing IC reliability models assume a uniform, typically worst-case, operating temperature, but temporal and spatial temperature variations affect expected device lifetime. This paper presents a model that accounts for temperature gradients, dramatically improving interconnect and gate-oxide lifetime prediction accuracy. By modeling expected lifetime as a resource that is consumed over time a...

متن کامل

Managing Reliability of Integrated Circuits: Lifetime Metering and Design for Healing

In nanoscale CMOS devices, manufacturing control, process variations and device reliability have emerged as dominant concerns. Reliability problems manifest over lifetime of ICs in multiple ways, from gradual degradation in performance to increasing leakage current to catastrophic failures. Each of these effects are modulated by workloads executing on ICs and ambient conditions which influence ...

متن کامل

Analysis of Temporal and Spatial Temperature Gradients for IC Reliability

One of the most common causes of IC failure is interconnect electromigration (EM), which exhibits a rate that is exponentially dependent on temperature. As a result, EM rate is one of the major determinants of the maximum tolerable operating temperature for an IC and of resulting cooling costs. Previous EM models have assumed a uniform, typically worst-case, temperature. This paper presents a m...

متن کامل

Interconnect Lifetime Prediction for Temperature - Aware Design

Thermal effects are becoming a limiting factor in high-performance circuit design due to the strong temperature-dependence of leakage power, circuit performance, IC package cost and reliability. Temperature-aware design tries to maximize performance under a given thermal envelope through various static and dynamic approaches. While existing interconnect reliability models assume a constant temp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005